Switch from cuda-memcheck to compute-sanitizer #289

Merged: trevilo merged 4 commits into main from switch-to-compute-sanitizer on Jul 18, 2024

@trevilo trevilo commented Jul 17, 2024

After an OS upgrade on the CI system, cuda-memcheck fails to detect the known errors in badcuda.cpp, which are based on examples from the NVIDIA docs:

https://docs.nvidia.com/cuda/archive/11.4.1/cuda-memcheck/index.html#example-use-of-memcheck

It isn't clear why this is the case, but since cuda-memcheck is deprecated in favor of compute-sanitizer in later versions anyway, I am switching to compute-sanitizer. It successfully detects errors in badcuda.
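
For readers who want to reproduce this kind of check outside the test suite, here is a minimal sketch of running a binary under compute-sanitizer and treating a dedicated exit code as "errors detected". It assumes `compute-sanitizer` is on `PATH`. The binary path `./badcuda`, the helper names, and the exit code 86 are illustrative assumptions, not taken from this PR. `--tool` and `--error-exitcode` are standard compute-sanitizer options.

```python
import shutil
import subprocess

# Hypothetical exit code, chosen to be unlikely to collide with the
# application's own exit codes.
SANITIZER_ERROR_EXITCODE = 86


def sanitizer_command(binary, tool="memcheck", error_exitcode=SANITIZER_ERROR_EXITCODE):
    """Build the compute-sanitizer command line for a test binary."""
    return [
        "compute-sanitizer",
        "--tool", tool,
        "--error-exitcode", str(error_exitcode),
        binary,
    ]


def sanitizer_detects_errors(binary, error_exitcode=SANITIZER_ERROR_EXITCODE):
    """Return True/False depending on whether compute-sanitizer reported errors.

    Returns None when compute-sanitizer is not installed, so callers can
    skip the check instead of failing.
    """
    if shutil.which("compute-sanitizer") is None:
        return None
    result = subprocess.run(
        sanitizer_command(binary, error_exitcode=error_exitcode),
        capture_output=True,
        text=True,
    )
    return result.returncode == error_exitcode
```

A regression test for a deliberately broken program such as badcuda would then expect `sanitizer_detects_errors("./badcuda")` to be `True` on a machine with a GPU and the CUDA toolkit installed.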

trevilo added 2 commits July 17, 2024 16:52

trevilo commented Jul 18, 2024

The PR now also includes a small update to the GitLab CI control, which was necessary because of changes to GitLab's runner authentication-token framework.

trevilo added 2 commits July 18, 2024 09:59
For some reason that I don't understand, lomach-flow.test 5 (i.e.,
"verify consistent serialized restart for Tomboulides with lid-driven
cavity with 2 mpi ranks") consistently fails on one of our testing
systems that uses flux b/c it thinks that the restart file written by
the solver does not exist when the test script attempts to mv the
file (the mv on line 133 of this version).

This commit, which adds some steps between the tps solve and the mv,
appears to avoid the issue.
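
The commit message does not say which steps were added, and the actual test is a shell script. Purely as a generic illustration of guarding a move against a file that becomes visible late, the sketch below polls for the file before moving it. The function name, timeout, and polling interval are hypothetical, and this is not the change made in the commit.

```python
import os
import shutil
import time


def move_when_present(src, dst, timeout=10.0, poll=0.1):
    """Move src to dst, waiting up to `timeout` seconds for src to appear.

    Raises FileNotFoundError if src is still missing when the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while not os.path.exists(src):
        if time.monotonic() >= deadline:
            raise FileNotFoundError(f"{src} did not appear within {timeout} s")
        time.sleep(poll)
    shutil.move(src, dst)
    return dst
```

A guard like this only hides the symptom. If the file is delayed by the job launcher or the filesystem, the underlying cause still deserves investigation.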
@trevilo trevilo merged commit 90f4346 into main Jul 18, 2024
18 checks passed
@trevilo trevilo deleted the switch-to-compute-sanitizer branch July 19, 2024 17:06